
    Nonbipartite Dulmage-Mendelsohn Decomposition for Berge Duality

    The Dulmage-Mendelsohn decomposition is a classical canonical decomposition in matching theory for bipartite graphs. It is known both for its applications in matrix computation and for providing a prototypical structure in matroidal optimization theory. Because the decomposition is stated and proved in terms of the two color classes, generalizing it to nonbipartite graphs has been a difficult task. In this paper, we obtain a new canonical decomposition that generalizes the Dulmage-Mendelsohn decomposition to arbitrary graphs, using a recently introduced tool in matching theory, the basilica decomposition. Our result enables us to understand all known canonical decompositions in a unified way. Furthermore, we apply our result to derive a new theorem regarding barriers. The duality theorem for the maximum matching problem is the celebrated Berge formula, in which the dual optimizers are known as barriers. Several results regarding maximal barriers have been derived from known canonical decompositions; however, no characterization has been known for general graphs. In this paper, we provide a characterization of the family of maximal barriers in general graphs, in which the known results are developed and unified.
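
For readers unfamiliar with the duality mentioned in this abstract, the Berge formula can be stated as follows. The notation is an assumption chosen for illustration and may differ from the paper's: $\nu(G)$ is the size of a maximum matching and $q(G - X)$ is the number of connected components of $G - X$ with an odd number of vertices.

```latex
\nu(G) \;=\; \frac{1}{2}\,\min_{X \subseteq V(G)}
  \Bigl( |V(G)| + |X| - q(G - X) \Bigr)
```

A barrier is a vertex set $X$ that attains this minimum, and a maximal barrier is one that is inclusion-wise maximal among barriers.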

    A User-Friendly Hybrid Sparse Matrix Class in C++

    When implementing functionality that requires sparse matrices, there are numerous storage formats to choose from, each with advantages and disadvantages. To achieve good performance, several formats may need to be used in one program, requiring explicit selection of and conversion between the formats. This can be both tedious and error-prone, especially for non-expert users. Motivated by this issue, we present a user-friendly sparse matrix class for the C++ language, with a high-level application programming interface deliberately similar to the widely used MATLAB language. The class internally uses two main approaches to achieve efficient execution: (i) a hybrid storage framework, which automatically and seamlessly switches between three underlying storage formats (compressed sparse column, coordinate list, Red-Black tree) depending on which format is best suited for specific operations, and (ii) template-based meta-programming to automatically detect and optimise execution of common expression patterns. To facilitate relatively quick conversion of research code into production environments, the class and its associated functions provide a suite of essential sparse linear algebra functionality (e.g., arithmetic operations, submatrix manipulation) as well as high-level functions for sparse eigendecompositions and linear equation solvers. The latter are achieved by providing easy-to-use abstractions of the low-level ARPACK and SuperLU libraries. The source code is open and provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
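
The hybrid-storage idea in this abstract can be illustrated with a small sketch. This is not the paper's C++ class or its API: the class name `HybridSparse` and its methods are hypothetical, a Python dictionary stands in for the insertion-friendly format (the paper names a Red-Black tree), and only two formats are modelled instead of three. The point is only the switching pattern: writes go to a structure that is cheap to update, and a compressed sparse column (CSC) form is built lazily when a product is requested.

```python
class HybridSparse:
    """Toy hybrid sparse matrix (hypothetical, for illustration only)."""

    def __init__(self, n_rows, n_cols):
        self.n_rows = n_rows
        self.n_cols = n_cols
        self._entries = {}  # (row, col) -> value; cheap random insertion
        self._csc = None    # cached (values, row_idx, col_ptr); None when stale

    def set(self, row, col, value):
        """Store one element; any write invalidates the cached CSC form."""
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise IndexError("index out of range")
        if value == 0:
            self._entries.pop((row, col), None)  # keep only nonzeros
        else:
            self._entries[(row, col)] = value
        self._csc = None

    def _to_csc(self):
        """Build the CSC arrays on demand and cache them."""
        if self._csc is None:
            values, row_idx = [], []
            col_ptr = [0] * (self.n_cols + 1)
            # Visit nonzeros ordered by column, then by row.
            ordered = sorted(self._entries.items(),
                             key=lambda kv: (kv[0][1], kv[0][0]))
            for (r, c), v in ordered:
                values.append(v)
                row_idx.append(r)
                col_ptr[c + 1] += 1  # count nonzeros per column
            for c in range(self.n_cols):
                col_ptr[c + 1] += col_ptr[c]  # prefix sum gives column starts
            self._csc = (values, row_idx, col_ptr)
        return self._csc

    def matvec(self, x):
        """Return A @ x using the column-compressed form."""
        if len(x) != self.n_cols:
            raise ValueError("dimension mismatch")
        values, row_idx, col_ptr = self._to_csc()
        y = [0.0] * self.n_rows
        for c in range(self.n_cols):
            for k in range(col_ptr[c], col_ptr[c + 1]):
                y[row_idx[k]] += values[k] * x[c]
        return y


m = HybridSparse(3, 3)
m.set(0, 0, 2.0)
m.set(1, 2, 3.0)
m.set(2, 0, -1.0)
m.set(2, 2, 4.0)
print(m.matvec([1.0, 2.0, 3.0]))
```

The caller never chooses a format: element-wise writes and the matrix-vector product each use the representation that suits them, and the conversion cost is paid only when a write is followed by a product.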

    Comparative Performance Analysis of Coarse Solvers for Algebraic Multigrid on Multicore and Manycore Architectures

    We study the performance of a two-level algebraic-multigrid algorithm, with a focus on the impact of the coarse-grid solver on performance. We consider two algorithms for solving the coarse-space systems: the preconditioned conjugate gradient method and a new robust HSS-embedded low-rank sparse-factorization algorithm. Our test data comes from the SPE Comparative Solution Project for oil-reservoir simulations. We contrast the performance of our code on one 12-core socket of a Cray XC30 machine with performance on a 60-core Intel Xeon Phi coprocessor. To obtain top performance, we optimized the code to take full advantage of fine-grained parallelism and made it thread-friendly for high thread counts. We also developed a bounds-and-bottlenecks performance model of the solver, which guided the optimization effort, and carried out performance tuning in the solver’s large parameter space. As a result, significant speedups were obtained on both machines.
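
As background for the first coarse solver named in this abstract, here is a minimal serial sketch of the preconditioned conjugate gradient method. The abstract does not say which preconditioner the authors use; the Jacobi (diagonal) preconditioner below is an assumption, as are the function name `pcg`, the dense list-of-lists matrix, and the stopping rule. It says nothing about the parallel implementation studied in the paper.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG for a symmetric positive definite matrix A.

    A is a dense list of lists and b a list of floats (illustrative choice).
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # assumed preconditioner M^{-1}
    x = [0.0] * n
    r = list(b)                                   # residual b - A x with x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0              # avoid dividing by zero

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:      # relative residual test
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
print(pcg(A, b))  # close to [1/11, 7/11]
```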

    Fast interior point solution of quadratic programming problems arising from PDE-constrained optimization

    Interior point methods provide an attractive class of approaches for solving linear, quadratic and nonlinear programming problems, due to their excellent efficiency and wide applicability. In this paper, we consider PDE-constrained optimization problems with bound constraints on the state and control variables, and their representation on the discrete level as quadratic programming problems. To tackle complex problems and achieve high accuracy in the solution, one must solve matrix systems of huge scale resulting from Newton iteration, and hence fast and robust methods for these systems are required. We present preconditioned iterative techniques for solving a number of these problems using Krylov subspace methods, considering in what circumstances one may predict rapid convergence of the solvers in theory, as well as the solutions observed from practical computations.

    A Schur complement approach to preconditioning sparse linear least-squares problems with some dense rows

    The effectiveness of sparse matrix techniques for directly solving large-scale linear least-squares problems is severely limited if the system matrix $A$ has one or more nearly dense rows. In this paper, we partition the rows of $A$ into sparse rows and dense rows ($A_s$ and $A_d$) and apply the Schur complement approach. A potential difficulty is that the reduced normal matrix $A_s^T A_s$ is often rank-deficient, even if $A$ is of full rank. To overcome this, we propose explicitly removing null columns of $A_s$ and then employing a regularization parameter and using the resulting Cholesky factors as a preconditioner for an iterative solver applied to the symmetric indefinite reduced augmented system. We consider complete factorizations as well as incomplete Cholesky factorizations of the shifted reduced normal matrix. Numerical experiments are performed on a range of large least-squares problems arising from practical applications. These demonstrate the effectiveness of the proposed approach when combined with either a sparse parallel direct solver or a robust incomplete Cholesky factorization algorithm.
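
The row partitioning in this abstract leads to a block system that can be written out explicitly. The following is a sketch under assumptions, not the paper's exact formulation: the symbols $b_s$, $b_d$, $C_s$ and $y$ are introduced here for illustration, and $C_s = A_s^T A_s$ is taken to be nonsingular (the abstract notes that in practice it is often rank-deficient, which is why a regularization shift is used). Splitting the right-hand side conformally as $b = (b_s, b_d)$, the normal equations for $\min_x \|Ax - b\|_2$ and their block form are:

```latex
% Normal equations with the rows split into sparse and dense parts
(C_s + A_d^T A_d)\, x = A_s^T b_s + A_d^T b_d, \qquad C_s = A_s^T A_s .

% Introducing y = A_d x - b_d gives a symmetric indefinite block system
\begin{pmatrix} C_s & A_d^T \\ A_d & -I \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} A_s^T b_s \\ b_d \end{pmatrix} .

% Eliminating x yields the Schur complement system for y
(I + A_d C_s^{-1} A_d^T)\, y = A_d C_s^{-1} A_s^T b_s - b_d, \qquad
x = C_s^{-1} \bigl( A_s^T b_s - A_d^T y \bigr) .
```

The Schur complement system has one unknown per dense row, so it stays small when only a few rows of $A$ are dense.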

    Fast and accurate protein substructure searching with simulated annealing and GPUs

    Background: Searching a database of protein structures for matches to a query structure, or occurrences of a structural motif, is an important task in structural biology and bioinformatics. While there are many existing methods for structural similarity searching, faster and more accurate approaches are still required, and few current methods are capable of substructure (motif) searching. Results: We developed an improved heuristic for tableau-based protein structure and substructure searching using simulated annealing that is as fast as or faster than, and comparable in accuracy to, some widely used existing methods. Furthermore, we created a parallel implementation on a modern graphics processing unit (GPU). Conclusions: The GPU implementation achieves up to 34 times speedup over the CPU implementation of tableau-based structure search with simulated annealing, making it one of the fastest available methods. To the best of our knowledge, this is the first application of a GPU to the protein structural search problem.